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Abstract.  In  signal  acquisition,  Toeplitz  and  circulant  matrices  are  widely 
used  as  sensing  operators.  They  correspond  to  discrete  convolutions  and  are 
easily  or  even  naturally  realized  in  various  applications.  For  compressive  sens¬ 
ing,  recent  work  has  used  random  Toeplitz  and  circulant  sensing  matrices  and 
proved  their  efficiency  in  theory,  by  computer  simulations,  as  well  as  through 
physical  optical  experiments.  Motivated  by  recent  work  [8],  we  propose  mod¬ 
els  to  learn  a  circulant  sensing  matrix/operator  for  one  and  higher  dimensional 
signals.  Given  the  dictionary  of  the  signal(s)  to  be  sensed,  the  learned  circulant 
sensing  matrix/operator  is  more  effective  than  a  randomly  generated  circulant 
sensing  matrix/operator,  and  even  slightly  so  than  a  (non-circulant)  Gaussian 
random  sensing  matrix.  In  addition,  by  exploiting  the  circulant  structure,  we 
improve  the  learning  from  the  patch  scale  in  [8]  to  the  much  large  image  scale. 
Furthermore,  we  test  learning  the  circulant  sensing  matrix/operator  and  the 
nonparametric  dictionary  altogether  and  obtain  even  better  performance.  We 
demonstrate  these  results  using  both  synthetic  sparse  signals  and  real  images. 


1.  Introduction.  Compressive  sensing  (CS)  ([7,  5])  acquires  a  compressible  signal 
from  a  small  number  of  linear  projections.  Let  x  denote  an  n-dimensional  real  or 
complex  vector  that  is  sparse  under  a  certain  basis  T,  i.e.,  one  can  write  x  =  T# 
with  a  sparse  6.  Let  b  :=  <frx  represent  a  set  of  m  linear  projections  of  x.  The  basis 
pursuit  problem 

BP:  min  ||%  s.t.  =  b,  (1) 

o 

as  well  as  several  other  methods,  has  been  known  to  return  a  sparse  vector  6  and 
thus  recover  x  =  T#  under  certain  conditions  on  the  sensing  matrix  and  basis  T. 

Gaussian  random  matrix  <f>  is  in  the  first  studied  CS  matrices  and  has  nice  prop¬ 
erties.  Given  enough  underdetermined  measurements,  (1)  with  Gaussian  random 
matrix  is  guaranteed  to  exactly  recover  a  sparse  x  with  high  probability.  How¬ 
ever,  in  many  CS  applications,  the  acquisition  of  the  linear  projections  Qx  requires 
a  physical  implementation.  In  most  cases,  the  use  of  an  i.i.d.  Gaussian  random 
matrix  is  either  impossible  or  overly  expensive.  This  motivates  the  study  of  eas¬ 
ily  implementable  CS  matrices.  Two  types  of  such  matrices  are  the  Toeplitz  and 
circulant  matrices,  which  have  been  shown  to  be  almost  as  effective  as  the  Gaussian 
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random  matrix  for  CS  encoding/decoding.  A  Toeplitz  matrix  T  has  the  same  ele¬ 
ment  on  each  diagonal,  and  a  circulant  matrix  C  is  a  special  Toeplitz  matrix  with 
each  row  obtained  by  shifting  the  preceeding  row  one  element  to  the  right,  namely, 


tn  tn  —  1  '  '  '  '  '  '  t\ 

tn  tn—  1  ’  '  ’  '  ’  ’  t\ 

tn+ 1  tn  tn  —  1 

t\  tn  tn  —  1 

tn-\-  2  tn-\- 1 

,c  = 

t2  t\  '  •  '  •  : 

tn  tn  —  1 

tn  tn  —  1 

_Un  — 1  '  '  '  tn-\- 2  tn-\- 1  tn 

tn  —  1  '  '  '  t2  ti  tn 

When  matrix  T  satisfies  the  additional  property  that  U  =  £n+^,Vi,  it  becomes  a 
circulant  matrix  C .  For  more  about  Toeplitz  and  circulant  matrices,  please  refer  to 
the  review  paper  [11].  Since  a  (partial1)  Toeplitz  matrix  has  very  similar  theoretical 
and  computational  properties  to  a  (partial)  circulant  matrix  of  the  same  size,  our 
discussions  below  are  based  exclusively  on  the  circulant  matrix.  Using  Toeplitz, 
rather  than  circulant,  matrices  will  incur  some  insignificant  computation  overhead 
to  the  methods  proposed  in  this  paper. 

1.1.  Circulant  Compressive  Sensing.  In  various  physical  domains,  it  is  easy  to 
realize  Cx  since  it  is  equivalent  to  the  discrete  convolution  c  *  x  for  a  certain  vec¬ 
tor  c;  if  P  is  a  row-selection  (downsampling)  operator,  PCx  becomes  circulant  CS 
measurements  of  x.  Using  either  Toeplitz  or  circulant  matrices,  Tropp  et  al.  [28]  de¬ 
scribes  a  random  filter  for  acquiring  a  signal  x\  Haupt  et  al.  [12]  describes  a  channel 
estimation  problem  to  identify  a  vector  x  (called  impulse  responses)  that  charac¬ 
terizes  a  discrete  linear  time-invariant  (LTI)  system;  Meng  et  al.  [20,  21]  applies  it 
to  high-resolution  OFDM  channel  estimation.  Random  convolutions  can  also  be 
applied  in  some  imaging  systems  in  which  convolutions  either  naturally  arise  or  can 
be  physically  realized  [3,  14,  18,  17,  25].  Furthermore,  random  convolutions  can  be 
realized  by  an  optical  correlator  [25].  Since  any  circulant  matrix  C  in  (2)  can  be 
diagonalized  by  a  Fourier  transform,  i.e.,  obeying 

C  =  FDF\  (3) 

where  F  is  the  discrete  Fourier  matrix  of  the  same  size  as  C  and  D  is  a  diagonal 
matrix,  implementing  Cx  is  equivalent  to  implementing  FDF*x ,  which  can  be  real¬ 
ized  through  optical  lens  and  a  static  spatial  light  modulator  (SLM)  [30].  Recently, 
the  use  of  Toeplitz  and  circulant  matrices  has  been  proposed  for  compressive  MR 
imaging  by  Liang  et  al.  [15]. 

There  are  rich  theoretical  results  on  circulant  matrices  for  CS.  In  [2],  Toeplitz 
measurement  matrices  are  constructed  with  i.i.d.  random  entries  of±lor{  — 1,0,1}; 
their  downsampling  effectively  selects  the  first  m  rows;  and  the  number  of  measure¬ 
ments  for  stable  recovery  is  shown  to  be  m  >  0{k 3  •  logn/fc),  where  k  is  the 
signal  sparsity.  The  work  [13]  selects  the  first  m  rows  of  a  Toeplitz  matrix  with 

1.1. d.  Bernoulli  or  Gaussian  entries  for  sparse  channel  estimation.  Their  scheme 
requires  m  >  0(k 2  •  logn)  for  stable  t\  recovery.  The  work  [21]  establishes  stable 
recovery  under  the  condition  m>  0(k2  log(n/fc)).  In  [25],  random  convolution  with 
either  random  downsampling  or  random  demodulation  is  proposed  and  studied.  It 
is  shown  that  the  resulting  measurement  matrix  is  incoherent  with  any  given  sparse 


1By  “a  partial  Toeplitz  matrix”,  we  refer  to  a  submatrix  formed  by  selecting  m  out  of  the 
n  rows  of  the  original  Toeplitz  matrix,  where  m  <  n.  Given  a  row  selection  operator  P  and  a 
Toeplitz  matrix  T,  PT  is  a  partial  Toeplitz  matrix. 
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basis  with  high  probability  and  £\  recovery  is  stable  given  m  >  0(k  •  logn  +  log3  n). 
Recent  results  in  [24]  show  that  several  random  circulant  matrices  satisfy  the  re¬ 
stricted  isometry  property  (RIP)  in  expectation  and  with  high  probability  given 
m  >  0(max{&;3/2  log3/2  n,  Hog2  Hog2  n})  with  arbitrary  downsampling,  and  thus 
guarantee  exact  and  stable  CS  recovery.  We  note  that  all  these  results  are  based 
on  random  circulant  matrices,  so  they  do  not  apply  to  optimized  circulant  matrices 
in  this  paper.  We  demonstrate  that  learned  circulant  matrices  achieve  even  better 
performance. 

The  use  of  circulant  sensing  matrices  also  allows  faster  signal  and  image  recovery. 
Practical  algorithms  (e.g.,  [29,  31,  32,  35,  33])  for  CS  are  based  on  performing 
operations  including  multiplications  involving  with  and  <f>*.  For  partial  circulant 
matrix  =  PC,  4>x  and  &*y  each  can  be  quickly  computed  by  two  fast  Fourier 
transforms  (FFTs)  and  simple  component- wise  operations.  This  is  much  cheaper 
than  multiplying  a  general  matrix  with  a  vector.  For  image  recovery,  a  splitting 
algorithm  taking  advantages  of  the  circulant  structure  has  been  proposed  in  [34]  and 
shows  both  satisfactory  recovery  quality  and  speed.  These  fast  algorithms  apply  to 
the  learned  circulant  matrices  in  this  paper. 


1.2.  Learning  Dictionaries  and  Sensing  Matrices.  In  CS,  signal  and  image 
reconstructions  are  based  on  how  they  are  sparsely  represented.  The  sparse  rep¬ 
resentation  involves  a  choice  of  dictionary,  a  set  of  elementary  signals  (or  atoms) 
used  to  sparsely  decompose  the  underlying  signal  or  image.  There  are  analytic 
dictionaries  and  learned  dictionaries.  Examples  of  analytic  dictionaries  include  the 
discrete  cosine  basis,  various  wavelets  bases,  as  well  as  tight  frames.  Some  of  them 
are  orthogonal  while  others  are  over-complete.  Their  analytic  properties  have  been 
studied,  and  they  feature  fast  implementations;  hence,  they  have  found  wide  ap¬ 
plications.  Properly  learned  (as  opposed  to  analytic)  bases  can  give  rise  to  even 
sparser  representations  of  signals  and,  in  particular,  images,  so  they  can  give  better 
encoding  and  decoding  performance  than  the  analytic  dictionaries;  see  [23,  10,  22] 
for  examples  and  explanations.  Although  most  theoretical  results  of  CS  recovery 
do  not  apply  to  learned  dictionaries  and  optimized  sensing  matrices,  one  useful  tool 
is  the  so-called  mutual  coherence  between  a  dictionary  T  and  a  sensing  matrix  4>: 
with  D  :=  4>T  =  [di, . . . ,  d#],  it  is  defined  as  [16] 


KD)  := 


\d*dj\ 

\\dMdj\W 


(4) 


A  smaller  /a(D)  tends  to  allow  more  T-sparse  signals  x  to  be  successfully  recovered 
from  measurements  4>x  via  various  CS  algorithms.  Hence,  Elad  [9]  seeks  means  to 
reduce  y(D).  His  work  demonstrates  improved  recovery  quality  with  learned  (non- 
circulant)  sensing  matrices  4>,  and  it  has  motivated  the  subsequent  work  [8],  which 
is  not  based  on  minimizing  y(D)  but  instead  pursuing  (4>T)*(4>T)  ~  /,  where  I 
denotes  the  identity  matrix.  Note  that  (4>T)*(4>T)  «  I  directly  targets  the  RIP  of 
4>T,  because  in  that  case,  any  kxk  leading  minor  of  (4>T)*(4>T)  also  approximates 
the  identity  matrix.  The  results  of  [8]  are  even  better  than  those  in  [9]  since  y(D) 
is  a  worst-case  characteristic  whereas  (4>T)*(4>T)  ~  I  tends  to  reduce  the  average 
of  off-diagonal  elements  of  (4>T)*(4>T)  and  improve  the  recovery  performance  in 
average.  Also,  the  latter  is  easier  to  compute. 


1.3.  Contributions.  We  are  motivated  by  [8]  and  propose  numerical  methods  to 
minimize  ||(4>T)*(4>T)  —  I ||_p,  where  4>  is  either  a  full  or  partial  circulant  matrix. 
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Specifically,  we  determine  the  spectrum  vector  d  =  diag(D)  of  =  PC  =  P(FDF *) 
by  numerically  optimizing  the  magnitudes  of  the  entries  in  d  and  then  assigning 
them  uniformly  random  phases.  Note  that  if  a  circulant  matrix  C  is  generated  by 
either  assigning  a  random  vector  as  its  first  row  or  following  Romberg’s  strategy  [25], 
then  the  corresponding  vector  d  has  uniformly  random  phases.  The  only  yet  key 
difference  among  the  different  approaches  lies  on  the  magnitudes  of  d.  Romberg  [25] 
assigns  a  unit  magnitude  uniformly  whereas  we  choose  to  optimize  the  magnitudes 
in  order  to  adapt  to  the  training  data. 

Like  [8],  we  also  learn  <F  and  T  together,  performing  the  so-called  coupled  learning 
of  <F  and  T.  While  results  of  [8]  are  limited  to  one  dimensional  case2,  we  take  ad¬ 
vantages  of  the  circulant  structure  and  deal  with  more  than  one  dimension,  enabling 
the  learned  circulant  sensing  for  signals  such  as  images  and  videos.  Furthermore, 
we  address  the  patch-scale  limitation  of  [8] ,  namely,  the  sensing  matrix  size  n  must 
equal  the  dictionary  atom  size;  namely,  if  the  dictionary  for  a  512  x  512  image  is 
formed  by  8  x  8  patches,  the  sensing  matrix  generated  in  [8]  has  only  64  =  8  x  8 
columns  and  applies  to  vectors  of  length  64,  instead  of  512  x  512.  Hence,  the  learned 
sensing  matrix  cannot  be  applied  to  the  entire  image.  We  remove  this  limitation  and 
perform  image-scale  learning  by  generating  circulant  sensing  operators  applicable 
to  the  entire  signals  or  images. 

Our  approaches  are  tested  on  synthetic  ID  signals,  as  well  as  the  images  in 
the  Berkeley  segmentation  dataset  [19].  The  learned  circulant  sensing  matrix  gives 
better  recoverability  over  both  random  circulant  and  Gaussian  random  matrices. 
For  real  image  tests,  the  coupled  learning  approach  achieves  even  better  recovery 
performance. 

1.4.  Notation.  We  let  (*)T  and  (•)*  denote  transpose  and  conjugate  transpose , 
respectively,  conj(-)  stands  for  the  conjugate  operator.  Define  A  •  B  =  JT  .  AijBij 
and  (A,B)  =  JT  ■  conj (Aij)Bij  for  any  two  matrices  A,  B  of  the  same  dimension, 
vec (A)  is  a  vector  formed  by  stacking  all  the  columns  of  A ,  and  diag(-)  is  defined  in 
the  same  way  as  MATLAB  function  diag,  which  either  extracts  the  diagonal  entries 
of  a  given  matrix  to  form  a  vector  or,  given  a  vector,  forms  a  diagonal  matrix  with 
the  vector’s  entries.  In  addition,  ©  and  0  denote  component- wise  multiplication 
and  division,  respectively. 

The  rest  of  this  paper  is  organized  as  follows.  Section  2  overviews  one  and 
two  dimensional  circulant  correlations.  A  two-step  procedure  to  optimize  a  partial 
circulant  sensing  matrix/operator  is  described  in  section  3.  Section  5  describes  how 
the  involved  optimization  problems  are  solved.  Numerical  results  are  presented  in 
section  6. 


2.  Overview  of  Circulant  Correlations.  Given  a  kernel  v  G  Cn,  the  circulant 
correlation  x  a  v  for  a  vector  x  G  Cn  is  defined  as 


n 


(5) 


i= 1 


2 Strictly  speaking,  the  work  [8]  also  applies  to  two  dimensional  images.  However,  it  needs  to 
reshape  each  two  dimensional  image  patch  into  a  one  dimensional  vector. 
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where  ki  =  mod (n  +  z  — fc,  n)  + 1.  Let  c  =  [vn,  ,  tq]T.  Then  x*v  is  equivalent 

to  the  convolution  x  *  c,  which  is  defined  as 

n 

0r*c)fe  =  5>cfei,  for  k  =  1,...  ,ra, 

i=l 

where  ki  =  mod(n  +■  k  —  i  —  1,  n)  +  1. 

We  next  introduce  three  methods  to  compute  x*v.  First,  introducing  the  n  x  n 
cyclic  permutation  matrix 


0  0 
1  0 

o 


0 


0  1 
0  0 


0  0 
0  1  0 


we  can  write  formula  (5)  as 

{x  a  v)k  =  xT  P%~1v,  for  k  =  1, . . . ,  n. 


(6) 


Multiplying  Pn  from  the  left  circularly  down-shifts  by  a  row  and  multiplying 
from  the  right  circularly  right-shifts  a  column.  For  example,  given  v  =  [1,  2, 3] T 
and  x  =  [—1,  0, 1]T,  y  =  x  *  v  equals 


yx  =  xTPgv  =  —  l-l  +  0-  2  +  l-  3  =  2, 
y2  =  xTPsv  =  —  l-3  +  0-  l  +  l-  2  =  — 1, 
2/3  =  xTP£v  =  —  l-2  +  0-  3  +  l-  l  =  — 1. 


Secondly,  we  can  compute  x*v  via  the  matrix- vector  multiplication  Cx  with  the 
circulant  matrix 


Vi 

v2  •  • 

Vn 

Vl  • ' 

Vn—1 

V2 

v3  •  ■ 

Thirdly,  x  a  v  can  be  quickly  computed  by  two  fast  Fourier  transforms  and  some 
component-wise  multiplications  as  described  in  the  following  lemma,  which  is  a 
restatement  of  the  convolution  theorem. 


Lemma  2.1.  Any  circulant  matrix  C  in  (7)  can  be  written  as  C  =  FDF* ,  where 
F  is  the  n  x  n  unitary  discrete  Fourier  transform  matrix ,  i.e., 


Fij  =  f=  exp(-^AAli)  for  i  =  1  n;  j  =  1, . . . ,  n, 
\  n  n 


and  D  =  diag(d)  where  d  =  y/nFv. 


Remark  1.  The  matrix  C  is  real-valued  if  d  is  conjugate  symmetric ,  namely,  df  = 
conj {di>)  for  every  i  and  i'  =  mod(n  —  i  +  l,n)  +  1.  Imposing  conjugate  symmetry 
leads  to  real- valued  C  and  reduces  the  freedom  of  d  to  nearly  half. 


2D  circulant  correlation  inherits  the  nice  properties  of  ID  circulant  correlation. 
Given  a  kernel  M  E  CniXn2,  the  2D  circulant  correlation  Y  =  X  a  M  for  a  given 
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matrix  X  G  CniXn2  is  defined  by 


(X-kM)ts  =  's^2xijMtiSj,  for  1  <  t  <  m,  1  <  s  <  n2, 

h3 


(8) 


where  ti  —  mod(ni  +  i  —  t,n i)  +  1  and  Sj  =  mod(ri2  +  j  —  s,  77,2)  +  1. 

Similar  to  ID  circulant,  X*M  can  be  computed  in  three  ways.  First,  with  cyclic 
permutation  matrices  Pni  and  Pn2,  formula  (8)  can  be  written  as 

(X  *  M)ts  =  X  .  for  1  <  t  <  m,  1  <  a  <  n2. 

For  instance,  if 


1  -1 

-2  0 


X  = 

then  Y  =  X  *  M  has  components 

Y11  = 

F12  = 

Y13  = 


and 


M  = 


'  1 

-1 

0" 

T 

2 

3" 

=  -3, 

y2i  = 

'  1 

-1 

O' 

'4 

5 

6' 

-2 

0 

1 

• 

4 

5 

6 

-2 

0 

1 

• 

1 

2 

3 

'  1 

-1 

0" 

'3 

1 

2" 

=  -5, 

y22  = 

"  1 

-1 

O' 

'6 

4 

5' 

-2 

0 

1 

• 

6 

4 

5 

-2 

0 

1 

• 

3 

1 

2 

'  1 

-1 

0" 

"2 

3 

T 

=  -7, 

V23  = 

"  1 

-1 

O' 

'5 

6 

4' 

-2 

0 

1 

• 

5 

6 

4 

-2 

0 

1 

• 

2 

3 

1 

~  0, 


=  -2, 


=  -4. 


Secondly,  Y  =  X  *  M  can  be  computed  by  matrix- vector  multiplications  via 
vec (Y)  =  C2vec(X),  where 


C2  =  [vec(M1,:L)5  vec(M2,1)5 . . .  5  vec (Mni,1)5 . . .  5  vec(M1,s), 
. . . ,  vec (Mni’s), . . . ,  vec(M"1,n2)]T, 


(9) 


with  the  notation  :=  P^l~1M(Pj2)s~1  for  1  <  t  <  n\  and  1  <  s  <  It  is  not 
difficult  to  verify  that  C2  is  a  block-circulant  matrix.  For  the  above  example,  we 
have 


C2  = 


and  C2vec(X)  = 


"  1  " 

"-3" 

-2 

0 

-1 

-5 

0 

-2 

0 

-7 

1 

-4 

Thirdly,  we  can  quickly  compute  X  a  M  through  two  fast  2D  Fourier  transforms 
and  some  component-wise  multiplications  according  to  the  following  lemma,  which 
is  a  restatement  of  the  2D  convolution  theorem. 


Lemma  2.2.  Given  kernel  M  G  CniXn2;  define  2D  circulant  operator C2  onCniXri2 
by  C2(X)  =  X  a  M  for  X  G  CniXn2.  Then  C2  can  be  represented  as  C2  =  P2VPI, 
where  T2  is  the  orthogonal  2D  discrete  Fourier  transformation  operator  on  CniXn2 , 
Ttf  is  the  adjoint  operator  of  T2  as  well  as  its  inverse  operator,  and  V  is  an  operator 
on  CniXn2  defined  by  V(X)  =  V  0  X  for  any  X  G  CniXn2  with  V  =  (M). 

Remark  2.  The  block-circulant  matrix  C2  corresponding  to  C2  is  real  valued  if  V 
is  conjugate  symmetric ,  namely,  Vij  =  Vi>y  for  every  pair  of  i,j  and  if  =  mod(ni  — 
i  +  l,ni)  +  1,  j'  =  mod(n2  -  j  +  1,712)  +  1. 
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3.  Learning  Circulant  Sensing  Matrix/Operator.  In  this  section,  we  illus¬ 
trate  how  to  learn  the  ID  kernel  v  in  formula  (5)  and  the  2D  kernel  M  in  formula 
(8),  as  well  as  their  corresponding  downsampling  operators  P  (row  selection)  and  Pq 
(sample  selection);  hence,  we  form  a  partial  circulant  sensing  matrix  PC  G  Cmxn 
and  a  partial  2D  circulant  sensing  operator  TV7C2,  respectively.  Here,  P  selects  m 
out  of  the  n  rows  from  (7,  and  Vq  collects  the  entries  in  £2  C  {l,...,ni}x{l,...,n2} 
and  forms  a  column  vector.  For  example, 


X  = 


lead  to  Vq(X)  = 


1  2  3 

4  5  6 

7  8  9_ 

[4, 7, 2,  6]t. 


and 


n  =  {(1,2),  (2,1),  (3,1),  (2, 3)}, 


Remark  3.  In  our  experiments,  we  will  normalize  each  column  of  the  sensing 
matrix  <f>,  for  otherwise  the  recovery  can  be  numerically  instable  in  particular  when 
is  a  Gaussian  random  matrix.  Note  that  the  normalization  to  PC  (or  TV7C2)  is 
equivalent  to  multiplying  a  diagonal  matrix  (or  having  a  componentwise  operator) 
to  the  right,  and  thus  it  only  slightly  increases  the  computation. 

3.1.  Learning  ID  circulant  kernel  and  downsampler.  Let  T  G  CnxK  be  a 

given  dictionary  such  that  the  underlying  signal  x  =  4/0,  where  0  is  a  (nearly) 
sparse  vector.  Following  [8],  we  would  like  to  design  a  partial  circulant  sensing 
matrix  =  PC  such  that  ps  /,  where  C  is  an  n  x  n  circulant  matrix  and 

P  is  an  m  x  n  downsampling  matrix.  To  construct  a  partial  circulant  matrix  <F, 
we  shall  determine  P  and  C .  We  first  learn  the  circulant  matrix  C  and  then  the 
downsampler  P. 

According  to  Lemma  2.1,  C  =  FDF*  where  the  diagonal  matrix  D  =  diag(d) 
with  entries  d  =  yJnFv  and  v  is  the  kernel  of  C.  Therefore,  learning  C  is  equivalent 
to  choosing  the  best  kernel  v  or  vector  d.  Our  approach  is  based  on  solving 

rmn||T*C*CT-/||F,  (10) 

so  that  4/*(7*(7T  /.  From  F*F  =  /,  we  have 

=  T*Fdiag(d*)diag(d)F*T. 

Introduce  B  :=  which  satisfies  F>*  =  B.  Let  B  :=  [|F?^|2].  Given  T, 

matrices  B  and  B  are  constant  matrices.  Let 

z  :=  (|<fi|2,  |d2|2, . . . ,  \dn |2)T  >  0,  (11) 

which  are  unknowns  to  be  determined.  We  have  diag(d*)diag(d)  =  diag(z)  and 

^*c*c^  -  i\\2f 

=  ^\\y*Fdmg(d*)dmg(d)F*V-I\\2F 

1  77 

=  -tr(«'*Fdiag(z)Bdiag(^)F*^)  -  tr(^*Fdmg(z)F*^)  +  - 

£  Za 

1  n 

=  -ti{dmg{z)Bdmg{z)B)  —  tr(diag(z)R)  +  — 

=  ^zTdiag(Hdiag(z)F>)  —  zTdiag(H)  +  ^ 

=  ^zTBz-zTdmg(B)  +  T^. 
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Hence,  we  reduce  (10)  to 

min  -ztBz  —  zTdiag(P).  (12) 

z>o  2 

Problem  (12)  is  a  convex  quadratic  program  and  can  be  reformulated  as  a  non¬ 
negative  least-squares  problem. 

Given  a  solution  z  =  zopt  to  (12),  any  d  obeying  (11)  gives  C  =  Fdiag(d)F*  as 
a  solution  to  (10).  Since  only  |d*|,  i  =  1, . . . ,  n,  are  specified,  the  phases  of  di  are 
still  subject  to  determine.  Generally,  phases  encode  the  location  of  the  information 
in  a  signal.  Since  such  location  is  unknown  at  the  time  of  sensing,  we  choose  to 
generate  the  phase  of  every  di  uniformly  at  random,  which  is  suggested  by  [25] .  We 
want  to  mention  that  if  T  is  an  orthogonal  basis  or  tight  frame,  i.e. ,  4/4/*  =  /,  the 
solution  of  (12)  is  simply  a  vector  with  all  ones,  and  in  this  case  our  generated  d 
is  the  same  as  that  in  [25].  However,  as  TT*  ^  /,  e.g.,  all  the  learned  dictionaries 
in  our  experiments,  the  solution  of  (12)  is  generally  not  the  all-one  vector,  and  the 
optimized  d  is  different  from  that  in  [25].  Models  (10)  and  (12)  lead  to  significant 
improvement  in  sensing  efficiency  as  we  shall  demonstrate  by  numerical  examples 
in  section  6. 

Remark  4.  The  above  procedure  generates  a  complex- valued  C.  To  obtain  a  real¬ 
valued  (7,  one  shall  add  constraints  Zi  =  zn_i+2  for  i  =  2, . . . ,  n  in  the  problem  (12) 
and  then  generate  a  conjugate  symmetric  d  from  the  solution  zopt. 

As  4/  is  square,  i.e.,  n  =  K,  we  also  tested  an  alternative  model  as  follows. 

Though  it  does  not  work  as  well  as  (10),  we  believe  it  is  worth  conveying  the 

information  to  the  readers.  The  model  is 

imn  ^||C,'3/  -  I\\2F,  (13) 

which  is  equivalent  to 

min  -dT Ad  -  LTdiag(F*'I'F)  -  -conj(Ydiag(F*^F)),  (14) 

2  2  2 

where  A  is  a  diagonal  matrix  with  diagonal  entries  given  by  vector  diag (F*T4/*F). 
Since  (14)  is  component-wise  separable  in  di,  it  is  easy  to  derive  its  closed- form 
solution 

dopt  =  conj(diag(F*TF))  0  diag (F*TT*F).  (15) 

Our  numerical  experience  suggests  that  this  solution  (and  thus  model  (13))  is  not 
as  effective  as  that  of  (10).  On  the  other  hand,  4/  is  usually  overcomplete,  for  which 
case  (13)  is  inapplicable. 

After  the  circulant  matrix  C  is  determined,  we  now  optimize  P,  and  thus  fully 
determine  4>.  A  simple  yet  effective  solution  is  the  random  selection,  that  is,  let  P 
select  m  out  of  the  n  rows  of  C  uniformly  at  random.  This  tends  to  work  well  on 
signals  without  dominating  frequencies. 

We  can  also  choose  to  minimize  ||4/*4>*4>4/  —  I\\f  so  that  4/*4>*4>T  «  I.  Let  pi 
be  the  binary  variable  indicating  the  selection  of  element  i,  i.e., 


Pi  = 


1,  matrix  P  selects  row  i, 
0,  otherwise. 


(16) 
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Since  $  =  PG,  we  shall  solve 


nhn-||T*G*(P*P)GT-/|||  = 


^PiiQiQi)  ~  I 


i= 1 


(17) 


where  qi  is  the  i-th  column  of  T*G*.  Model  (17)  is  equivalent  to  the  binary  quadratic 
program 

f  minp||[vec(gigj),---  ,vec(qnq^)\p  -  vec(I)\\22  , 

[  subject  to  ^2iPi  =  m  and  pi  G  {0, 1},  Vi. 

Here,  [vec(gig^),  •  •  •  ,  vec(gng*)]  is  a  matrix  of  n2  rows  and  n  columns.  By  simple 
calculation,  we  can  write  (18)  into  the  following  equivalent  problem 


min pT Hp  —  2 /Tp,  subject  to  7  pi  —  m  and  pi  G  {0, 1},  Vi,  (19) 

i 

where  H  =  [|G^-|2],  /  =  diag(G)  and  G  =  GT4PG*.  Problem  (19)  is  NP-hard  in 
general  and  can  be  solved  to  a  moderate  size  by  solvers  such  as  Gurobi  [4]. 


3.2.  Learning  2D  circulant  kernel  and  downsampler.  A  2D  circulant  opera¬ 
tor  C2  is  often  used  in  imaging  [26,  30].  It  is  not  equivalent  to  a  circulant  matrix 
even  if  the  image  is  vectorized  but  can  be  reduced  to  a  block  circulant  matrix  with 
circulant  blocks  [6]. 

Given  a  dictionary  T  =  [Vq, ^2?  •  •  •  where  each  ^  G  CniXn2,  we  define  a 
linear  operator  Q  on  CK  by  Q(0)  =  1  Oi'&i  f°r  0  G  CK .  An  array  X  G  CniXn2 

is  sparse  with  respect  to  the  dictionary  T  if  it  can  be  represented  as  X  =  Q(6) 
with  a  sparse  0  G  CK .  Let  Vq  be  the  downsampling  operator  on  CniXn2  defined  at 
the  beginning  of  this  section.  Then  the  basis  pursuit  problem  to  recover  X  can  be 
written  as 

min  ||0||i,  subject  to  PqC2Q(0)  =  b,  (20) 

0 

where  b  =  P^C2(X)  contains  the  measurements  of  X.  To  improve  the  recoverability 
of  the  i\  optimization  problem  (20),  we  shall  try  to  make  (C2Q)*C2Q  close  to  the 
identity  operator  X  on  CK .  The  adjoint  operator  of  Q  is  defined  as 

Q*(X)  =  [(ipi,X),  (fcl),...,  (fe,I)]T,  for  any  l6Pxn!. 


Hence,  for  any  0  G  CK, 


(C2Q)*C2Q(0) 


K 


K 


T 


0iWi)>>  •  •  • ,  (w*),  5>Wi)>  , 

i=l  i—  1 


and  making  (C2Q)*C2Q  ss  X  is  equivalent  to  making  (C2(^),  Ylf=i  O&i'&i))  ~  0j 
for  1  <  j  <  K.  Toward  this  goal,  we  solve 

min||G-/|||,  subject  to  Gts  =  (C2(^t), C2(^s)),  (21) 

Gr 


which  can  simplified  as  follows.  According  to  Lemma  2.2,  we  have 


<C2  WAGM  =  <j-2v.f2*WA  j-2vjj(^)) 
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Let  V  =  conj(V)©V  and  define  U.  :=  V*V,  where  V*  is  the  adjoint  operator  of  V.  We 
have  U{X)  —UQX  for  any  X  E  CniXn2.  Let  u  =  vec (U)  and  Y  =  [y1^2, . . .  ,yK] 
with  ys  =  vec(Jr|('0s))  for  s  ■=  1,  2, . . . ,  K.  Then 

Gts  =  =  («©i/*)V  =  w*(conj(j/t)  ®ys). 

Hence, 

iigHf  =  Econj  (y*)  ■ Gts 

t,s 

=  X]conj  ©(conj(?/)  Qys))  •  (u*(conj(y*)  ©  ys)) 

t,S 

=  YU*  ((conj(y*)  ©J/s)(conj(yt)  Qys))*u 

t,S 

=  Yu*  (conj  y  ynwn) « = 

t,s 

where  A  :=  3  conj (yl  (yt)*)(ys  (ysY)  =  conj  (FT*)  ©  (YY*).  In  addition, 

tr(G)  =  tr (G*)  =  ^u*(conj (/)  ©  y*)  =  (diag(Fr*))*u  =  f*u , 

t 

with  /  :=  diag(FF*).  Therefore,  problem  (21)  is  equivalent  to  the  convex  quadratic 
program 

min  ^-rtAr  —  /tr  (22) 

u>o  2 

in  the  same  form  of  (12),  where  we  have  used  real  transpose  instead  of  conjugate 
transpose  since  /,  u  and  A  are  all  real. 

Given  a  solution  u  =  R0pt  of  (22),  we  can  obtain  matrix  U  via  u  =  vec (U).  Any 
V  satisfying  U  =  conj(F)  ©  V  defines  C2  =  J-2VJ-2  as  a  solution  to  (21).  Like  done 
in  Section  3.1  for  ID  circulant,  we  choose  to  generate  the  phase  of  each  entry  Vts 
uniformly  at  random. 


Remark  5.  The  above  procedure  gives  a  complex- valued  C2.  To  obtain  a  real¬ 
valued  C2,  we  can  add  constraints  R(ni-4)j+i£  =  R(m-i )j'+i'  f°r  every  pair  of  i,  j  and 
i'  =  mod(ni  —  i  +  1,  ni)  +  1,  j'  =  mod(ri2  —  j  +  1,  ri2)  +  1  to  (22),  and  then  generate 
a  conjugate  symmetric  V  from  the  solution  izopt. 

Given  C2,  we  can  choose  H  uniformly  at  random  or  optimize  Vq  such  that 
{VqC2  Q)  *(VqC2Q)  ~  X,  which  can  be  achieved  by  solving 

min  ||G  -  subject  to  Gts  =  {VnC2{‘ipt),'PnC2 (VU)-  (23) 

Gr 


Let  n  =  ni 77,2  and  Dq  G  Cnxn  be  a  diagonal  matrix  with  diagonal  entries  defined 
by 


(Dn)k 


1  kon 


if  (i,  j)  e  Q, 
otherwise, 


for  1  <  i  <  ni,  1  <  j  <  n 2, 


(24) 


where  %  =  i  +  ( j  -  l)nx.  Then  (Dnvec(X),  Dnvec(Y))  =  (Vn X,VqY)  for  any 
X,Y  E  CniXn2.  Letting  ?/  =  vec^WO)  for  f  =  1, . . . ,  K,  we  have 


Gts  =  (VaC2(ilH),'PnC2{‘il>.))  =  {Dny\Dnys)  =  (y*)*A,y8. 
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Let  Y  =  ^2s  ys(ys)*.  Then 

\\G\\f  =  T]conj (G) -Gts  =  E  i^y^y8)  •  (( y ’T-tW) 

t,s  t,s 

=  ^(yt)*(Dn^n)yt  =  £>  (AAA22/V)*)  =  tr (DnYDnY) 

t  t 

=  Y,(DnYDQY)u  =  £(A,y)«  •  (D^Y)n  =  £(Dn)«  •  ^  •  (Da)jj  •  L* 

i  hi  hi 

=  y~X-Dn)it  •  y»j  •  conj(lij)  •  ( Du)jj  =  pT Hp, 
hi 

where  H  =  [|Y^-|2]  and  p  =  diag(D^).  In  addition, 

tr(G)  =  =  E(h)*AV  =  J>(lW(yT)  =  tr(AA)  =  fTp, 

t  t  t 

where  /  =  diag(T).  Hence,  (23)  is  equivalent  to 

min  -pT Hp  —  fTp ,  subject  to  pk  =  \Q\  and  pk  G  {0, 1},  Vfc.  (25) 

k 

4.  Image-scale  dictionary.  For  natural  images,  current  dictionary  learning  meth¬ 
ods,  e.g.,  KSVD  [1],  train  a  dictionary  T°  from  a  set  of  image  patches.  Hence,  the 
dictionary  atoms  have  the  same  size  as  patches.  In  practice,  we  often  take  measure¬ 
ments  directly  from  an  entire  image  X  instead  of  its  small  patches,  so  the  learned 
dictionary  T°  does  not  match  the  image  X.  How  can  we  use  T°  to  recover  X 
from  the  directly  taken  measurements?  One  approach  is  to  make  an  image-scale 
dictionary  under  which  X  is  sparse. 

Given  a  patch-scale  atom,  we  construct  a  list  of  image-scale  atoms  that  have 
no  common  non-zero  parts  by  placing  the  patch-scale  atom  at  each  possible  place 
in  the  image.  Special  cares  are  given  near  the  boundary  where  only  part  of  the 
patch-scale  atom  is  used.  The  precise  treatment  is  specified  below. 

Let  X  G  C^1*^2  be  an  image,  and  assume  that  each  of  its  small  patches  x  G 
(Cni  Xn2  5  where  n\  <  N\  and  n 2  <  N2,  has  a  sparse  representation  under  dictionary 
T°  =  [i/q, . . . ,  'ipx\-  We  want  to  construct  an  image-scale  dictionary  T'  from  T° 
such  that  X  is  sparse  under  Tg  Let  nr  =  [^\,nc  =  |_^J,  sr  =  Ni  —  nrni,sc  = 
N2  —ncri2,  and  N  =  nrnc  +  sign(sr)nc  +  sign(sc)nr  +  sign(sr5c).  For  each  atom  7/^, 
we  form  N  image-scale  atoms  . . . ,  G  C^1*^2  as  follows. 

•  For  j  =ss  1, . . . ,  nc  and  i  =  1, . . . ,  nr ,  let  t  =  (j  —  1  )nr  +  i  and  be  the  matrix 
with  all  elements  being  zero  except  having  ^  on  the  submatrix  indexed  by 
the  (nr(i  —  1)  +  l)th  through  (nri)th  row  and  the  (nc(j  —  1)  +  l)th  through 
(ncj)th  column. 

•  If  sr  7^  0,  let  t  m  nrnc  +  j  and  be  the  matrix  with  all  elements  being 
zero  except  having  the  last  sr  rows  of  ^/c  on  the  submatrix  indexed  by  the 
(nrni  +  l)th  through  TVith  row  and  the  ( nc(j  —  1)  +  l)th  through  (ncj) th 
column  for  j  =  1, . . . ,  nc. 

•  If  sc  7^  0,  nr  more  atoms  are  formed  in  a  similar  way  using  the  last  sc  columns 

of  Vtfc- 

•  If  scsr  7^  0,  we  form  the  last  atom  with  all  elements  being  zero  except 
having  the  right-bottom  sr  x  sc  submatrix  of  ipk  on  its  right-bottom  sr  x  sc 
submatrix. 
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Figure  1.  From  a  small  patch  -0,  we  form  N  image-scale  atoms 
Ti,4/2,...,Ttv  that  have  no  common  non-zero  parts  by  placing  ^ 
at  each  possible  place  in  the  image-size  frame.  Special  cares  are 
given  near  the  boundary  where  only  part  of  the  patch-scale  atom 
is  used. 


*1 

$2 

^nr 

^nr(nc- 1)  +  1 

0 

0 

0 

0 

0 


Note  that  any  two  atoms  generated  from  'ipk  have  no  common  non-zero  parts.  The 
image  X  has  a  sparse  representation  under  ^  since  each  patch  of  X  is  sparse  under 
T°.  Figure  1  illustrates  how  we  form  N  image-scale  atoms  Ti,T2,...,Ttv  from  a 
small  atom  i/j.  In  section  6,  we  will  show  that  this  generated  image-scale  dictionary 
works  well  in  recovering  an  entire  image  from  its  linear  measurements. 

5.  Algorithm  and  Implementation.  With  a  given  dictionary  T,  Algorithm  1 
outlines  our  approach  for  optimizing  a  partial  circulant  matrix  4>. 


Algorithm  1 

1:  Solve  (12)  for  z,  generate  randomly-phased  d  from  z  via  (11),  and  then  form  C  — 
Fdiag(d)F*. 

2:  Solve  (19)  for  p  or  randomly  generate  p  and  then  form  P  from  p  via  (16). 

3:  Generate  4>  =  PC. 


For  2D,  an  optimized  partial  circulant  operator  VqC^  can  be  obtained  in  a  similar 
way.  At  Step  1  of  Algorithm  1,  we  use  the  MATLAB  function  quadprog  to  solve 
(12)  with  its  default  settings.  At  Step  2,  we  use  the  commercial  code  Gurobi  [4] 
with  MATLAB  interface  [36]  to  solve  the  binary  program  (19).  In  our  test,  the 
maximum  number  of  iterations  was  set  to  2000.  Usually,  Gurobi  terminated  at  the 
maximum  number  of  iterations  with  the  best  solution  obtained.  Hence,  (19)  was 
only  approximately  solved. 

We  used  both  synthetic  and  real  data  for  test.  For  synthetic  data,  T  was  Gaussian 
randomly  generated  basis  or  discrete  Fourier  basis,  and  4>  was  learned  by  Algorithm 
1 .  For  real  data,  T  was  learned  in  either  an  uncoupled  way  or  a  coupled  way.  In  the 
uncoupled  test,  we  first  learned  T  from  a  set  of  training  data  X  =  [Xi, . . . ,  Xl\  E 
CnxL  by  applying  KSVD  [1]  to 

min  ||  A  —  4>0||f.,  subject  to  ||@i||o  <  S',  Vi 


(26) 
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for  a  given  sparsity  level  S',  where  0  =  [@i, . . . ,  0^]  and  ||0i||o  denotes  the  number 
of  non-zeros  of  0^.  Then  we  learned  <f>  by  Algorithm  1.  In  the  coupled  test,  we 
simultaneously  learned  a  pair  of  T  and  <f>  in  a  way  similar  to  [8] .  Specifically,  during 
each  loop  of  the  coupled  algorithm,  we  first  compute  via  Algorithm  1  with  a  fixed 
T,  then  update  0  by  applying  OMP  [27]  to 

mma\\X  -  tf0|||,  +  \\$X  -  $tf0|||.,  subject  to  ||©i ||0  <  S,  (27) 

where  a  >  0  is  a  weight  parameter,  and  finally  update  'I'  and  0  via  solving  (27)  with 
respect  to  T  and  the  current  nonzero  entries  of  0  jointly.  Algorithm  2  summarizes 
our  coupledly  learning  method,  and  in  our  experiments  we  terminated  the  algorithm 
if  the  relative  changes  of  \\X  —  T0||^  is  less  than  10-6. 


Algorithm  2 

1:  Initialization:  randomly  generate  T  or  set  T  as  some  learned  dictionary. 

2:  while  Not  converged  do 

3:  Compute  via  Algorithm  1  with  T  fixed; 

4:  Update  0  by  solving  (27); 

5:  Update  T  and  0  jointly  by  solving  (27)  with  <I>  and  the  nonzero  locations  of  0  fixed. 

6:  end  while 


After  obtaining  T  and  4>,  we  used  YALL1  [32]  (version  1.4)  to  recover  the  sparse 
signal  6  via  solving 

min\\e\\1  +  -\\^m-b\\l  (28) 

6  p 

where  b  =  4>(T<9  +  rj)  was  the  measurement  contaminated  by  Gaussian  noise  p  ~ 
jV(0,  a2 1)  and  p  >  0  was  a  parameter  corresponding  to  a.  Throughout  our  tests,  a 
was  known,  and  the  parameter  p  was  set  to  a.  If  p  =  0,  problem  (28)  reduces  to 

min  ||0||i,  subject  to  =  b.  (29) 

o 

The  stopping  tolerance  for  YALL1  was  set  to  10-5  for  noiseless  case  and  10-3  for 
noise  case,  unless  specified  otherwise.  The  maximum  number  of  iterations  was  set  to 
104,  which  is  sufficiently  large  to  make  YALL1  converge  within  the  given  tolerance. 

6.  Numerical  Simulations.  We  compared  the  performance  of  optimized  circulant 
sensing  matrices/operators  to  that  of  random  ones  on  both  synthetic  and  real-world 
data.  In  addition,  we  tested  a  pair  of  circulant  sensing  matrix  <f>  and  dictionary  T 
learned  together  on  real-world  data.  The  tested  sensing  matrices  and  2D  operators 
are  listed  in  Table  1.  To  be  fair,  we  either  take  all  the  compared  sensing  matrices 
and  2D  operators  to  be  real- valued  or  take  all  of  them  to  be  complex-valued  in  the 
same  test.  In  addition,  except  for  the  coupled  learned  circulant  matrix  coupled- 
plus-rand- circ,  all  sensing  matrices  or  2D  operators  were  generated  and  tested  with 
the  same  set  of  synthetic  or  learned  dictionaries. 

As  in  [8],  throughout  our  tests,  all  columns  of  T  and  <f>  were  normalized  to  have 
the  unit  2-norm.  The  normalization  of  T  is  only  for  convenience  while  that  of 
is  critical,  in  particular  when  4>  is  a  Gaussian  randomly  matrix,  for  otherwise  the 
recovery  by  YALL1  or  other  solvers  may  become  instable.  Similarly,  we  normalized 
all  partial  2D  circulant  operators.  We  next  introduce  an  efficient  way  to  normalize 
ID  matrix  PC  and  2D  operator  VqC 2. 
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Table  1.  List  of  tested  sensing  matrices  and  2D  operators* 


name 

type 

real  /  complex 

kernel 

downsampler 

dimension 

rand-circ 

circulant 

complex 

random 

random 

ID 

real-rand-  circ 

circulant 

real 

random 

random 

ID 

Gaussian 

Gaussian 

complex 

— 

random 

ID 

real- Gaussian 

Gaussian 

real 

— 

random 

ID 

opt- circ 

circulant 

complex 

optimized 

random 

ID 

real-opt-circ 

circulant 

real 

optimized 

random 

ID 

opt-  circ-  and- P 

circulant 

complex 

optimized 

optimized 

ID 

real- opt- circ- and- P 

circulant 

real 

optimized 

optimized 

ID 

opt-plus-rand-circ 

circulant 

complex 

optimized+randonh 

random 

ID 

coupled-plus-rand- circ 

circulant 

complex 

optimized+random 

random 

ID 

rand- 2D- circ 

circulant 

complex 

random 

random 

2D 

opt-plus-rand-2D-circ 

circulant 

complex 

optimized+random 

random 

2D 

*The  kernel  v  for  real  random  ID  circulant  was  generated  by  MATLAB  command  randn(n,l) 
and  that  for  complex  random  ID  circulant  by  randn(n,  l)+li*randn(n,  1) ;  real  Gaussian  was 
generated  by  randn(m,n)  and  complex  Gaussian  by  randn(m,n)+li*randn(m,n) ;  the  kernel  M  for 
real  random  2D  circulant  was  generated  by  randn(nl,n2)  and  that  for  complex  random  2D 
circulant  by  randn(nl  ,n2) +li*randn(nl  ,n2) ; 

^60%  of  the  normalized  optimized  circulant  plus  40%  of  a  normalized  real  random  circulant. 


6.1.  Normalization  of  ID  matrix  PC  and  2D  operator  VqC^  Let  C  be 
defined  in  (7)  and  c  =  [tq,  vn,  i?n_i, . . .  ,  iq]7”  be  its  first  column.  It  is  easy  to 
verify  that  the  j th  column  of  C  is  Cj  =  [cn_j+ 2?  cn-j+ 3>  •  •  •  ,  cn,  ci, . . . ,  cn_j+i]T. 
Let  i  —  [£\ , . . .  ,£n]T  with  £j  being  the  Euclidean  norm  of  the  j  th  column  of  PC. 
Recalling  (16),  we  have  for  j  =  1, . . . ,  n  that 

n 

Pj  =  \\PCj\\l  =  Plcl-j+2  +P2C2n_j+. 3  +  .  .  .Pj-lC2n  +Pjc\  +  .  ..pnC2n_j+1  =  Y^Picl.j, 

where  sy  =  mod(n  +  i  —  j,  n)  +  1.  Hence,  £j  =  f°r  j  =  1;  •  •  •  ,  n >  and 

PC(diag(^))-1  has  columns  with  the  unit  2-norm. 

For  P0C2,  recall  (9)  and  (24).  Then  normalizing  VqC2  is  equivalent  to  normal¬ 
izing  DqC2-  Let  n  =  77,177,2  and  £  =  [^1, . . .  ,£n]T  with  £j  being  the  Euclidean  norm 
of  the  jth  column  of  DqC2-  Note  that  for  1  <  t,  r  <  77,1  and  1  <  s,  k  <  77,2, 

(GOw™  =  M%  =  MatTfiaK 

where  its  =  t  +  (s  —  l)ni,  jTK  =  r  +  (/c  — l)ni  and  atT  =  mod(ni+r-t,  ni)  +  l,  /3SK  = 
mod(n2  +  ft  —  5,712)  +  1-  Hence,  for  each  pair  of  (r,  ft),  we  can  compute  £j  with 
j  =  r  +  (ft  —  1)77,1  via 

h  = 

Therefore,  we  normalize  DqC2  to  D^C^diag^))-1  and  P0C2  to  TV1C2A  where 
C(X)  =  X®L  for  any  X  G  CniXn2  with  L  G  CniXn2  obtained  from  £  via£  =  vec(L). 

6.2.  Synthetic  data.  We  tested  different  optimized  circulant  sensing  matrices 
on  synthetic  data  along  with  two  kinds  of  bases  T:  the  Gaussian  random  basis  and 
the  discrete  Fourier  basis.  The  performance  was  compared  to  that  of  unoptimized 
random  circulant,  as  well  as  random  Gaussian,  sensing  matrices.  The  Gaussian 
random  basis  was  generated  by  MATLAB  command  randn(n)  and  then  turned  to 
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an  orthogonal  matrix  by  QR  decomposition,  and  the  Fourier  basis  was  generated 
by  MATLAB  command  dftmtx(n)/sqrt  (n).  Both  bases  were  512  x  512,  and  the 
sensing  matrices  had  the  size  64  x  512,  i.e. ,  an  8x  downsample.  In  this  test,  9 
was  generated  with  k  randomly  located  nonzero  entries  sampled  from  the  standard 
Gaussian  distribution  and  then  normalized  to  have  the  unit  2-norm.  The  stopping 
tolerance  for  YALL1  was  set  to  5  x  10-8.  Figure  2  depicts  the  comparison  results 
of  sensing  matrices:  rand-circ ,  Gaussian ,  opt-circ ,  opt-circ-and-P ,  as  well  as  their 
real- valued  counterparts.  We  call  a  recovery  0 *  successful  if  the  relative  error  \\9*  — 
6\\ 2/ 1 1 6\\ 2  was  below  10-4.  We  calculated  the  success  rate  out  of  50  independent 
trials  at  every  sparsity  level  k. 

All  the  four  sub  figures  of  Figure  2  reveal  that  optimized  circulant  sensing  matri¬ 
ces  led  to  equally  good  performance  as  random  Gaussian  matrices.  For  the  random 
basis,  random  circulant  matrices  achieved  similar  recovery  success  rate,  while  they 
performed  extremely  bad  for  the  discrete  Fourier  basis.  The  reason  for  the  bad 
performace  of  random  circulant  matrices  can  be  found  in  [34].  On  the  other  hand, 
optimizing  the  selection  matrix  P  hardly  made  further  improvement.  We  believe 
that  unless  the  underlying  signal  has  dominant  frequencies,  optimizing  the  selection 
matrix  P  will  not  lead  to  consistent  improvement.  Therefore,  in  the  subsequent 
tests,  we  let  P  be  random  selection  matrices.  In  addition,  the  improvement  by 
optimizing  circulant  sensing  matrices  over  randomly  generated  ones  is  similar  for 
real  and  complex- valued.  For  convenience,  we  use  complex- valued  sensing  matri¬ 
ces/operators  in  the  rest  tests. 

Although  the  tests  above  are  based  on  synthetic  signals  that  are  exactly  sparse, 
similar  performance  of  different  sensing  matrices  was  observed  on  those  that  are 
nearly  sparse.  Since  real  images  are  nearly  sparse  (under  certain  dictionaries),  we 
skip  presenting  the  results  of  nearly  sparse  synthetic  signals  and  proceed  to  the  real 
image  tests. 

6.3.  Image  tests  with  circulant  matrices.  In  this  subsection,  we  present  the 
performance  of  sensing  matrices  rand-circ ,  Gaussian ,  opt- plus -rand-circ  and  coupled- 
plus-rand- circ  on  real  images.3  The  dictionaries  for  all  of  them  were  learned  from 
the  same  training  set  and  had  size  64  x  256,  namely,  we  set  K  =  256.  The  first 
three  sensing  matrices  shared  the  same  set  of  dictionaries  learned  by  KSVD  [1]  with 
sparsity  level  S  =  6,  8  in  (26),  and  the  last  one  was  learned  simultaneously  with 
its  corresponding  dictionary  by  the  coupled  method  described  in  Section  5  with 
sparsity  level  S'  =  6,8  and  a  =  ^  in  (27).  This  choice  of  a  was  recommended 
in  [8].  All  images  used  in  these  tests  were  scaled  to  have  the  unit  maximum  pixel 
value.  The  training  data  consists  of  20,000  8  x  8  patches,  that  are  100  randomly 
extracted  patches  from  each  of  the  200  images  in  the  training  set  of  the  Berkeley 
segmentation  dataset  [19].  The  100  images  in  the  testing  set  were  used  to  measure 
the  recovery  performance. 

In  the  first  set  of  tests,  we  uniformly  randomly  extracted  non-overlapping  600 
patches  from  the  100  test  images  to  recover  using  their  measurements  obtained 
with  the  four  sensing  matrices.  Figure  3  depicts  the  mean  squared  error:  MSE  = 
Eki  Ik  —  T#2  Hi/ (64^),  for  sparsity  level  S  =  6,  8  and  measurement  size  m  =  16,  24. 
Here,  xl  is  the  vector  obtained  by  reshaping  the  ith  selected  patch,  i  =  600  is  the 


3We  also  tested  opt-circ  and  coupled-circ ,  and  found  that  they  performed  as  well  as  opt-plus- 
rand-circ  and  coupled-plus-rand- circ  on  average.  However,  they  tend  to  cause  the  loss  of  small 
image  features  that  may  not  be  well  described  by  the  dictionary. 
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Figure  2.  Comparison  results  of  two  groups  of  sensing  matrices: 
rand-circ ,  Gaussian ,  opt-circ ,  opt-circ-and-P ,  and  their  real  valued 
counterparts  with  Gaussian  random  basis  (left)  and  Fourier  basis 
(right). 


number  of  tested  patches,  and  6l  was  the  YALL1  solution  of 

min\\8\\1  +  ±\\$W-b\\l  (30) 

where  b  =  $>(xl+crri)  and  a  =  o'||^||00/||^7||00  with  77  ~  A/*(0, 1)  and  a  varying  among 
{0,0.01,0.05,0.10,0.15}.  All  results  are  the  averages  of  20  independent  trials. 

All  the  four  pictures  in  Figure  3  reveal  that  both  uncoupled  and  coupled  learning 
approaches  achieved  significantly  better  recovery  over  random  circulant  matrices, 
and  they  did  even  better  than  Gaussian  random  matrices  when  mm  16.  The 
coupled  learning  approach  made  slight  improvement  over  the  uncoupled  one. 

In  the  second  set  of  tests,  we  chose  two  out  of  the  100  test  images  to  recover 
using  their  measurements  obtained  with  the  four  different  sensing  matrices.  The 
first  image  had  the  resolution  of  481  x  321,  and  the  second  one  had  the  resolution 
of  321  x  481.  For  convenience,  we  removed  the  last  row  and  the  last  column  of 
both  the  two  images  and  partitioned  each  images  into  1,200  8  x  8  patches.  Then 
we  used  YALL1  to  solve  (30)  for  each  patch  with  the  dictionary  T  learned  by 
KSVD  with  the  sparsity  level  S  =  6.  We  used  peak-signal-to-noise-ratio  (PSNR) 
and  mean  squared  error  (MSE)  to  measure  the  performance  of  recovery.  Table  2 
lists  the  average  results  of  20  independent  trials  for  measurement  size  m  =  16,  24 
and  noise  level  a  =  0,0.01,0.05,0.10,0.15.  One  set  of  the  recovered  images  are 
shown  in  Figure  4.  From  the  results,  we  can  see  that  both  uncoupled  and  coupled 
learning  approaches  made  significant  improvement  over  random  circulant  matrices, 
especially  for  m  =  16.  When  a  >  0.10  or  m  =  16,  the  coupled  learning  approach 
did  even  better  than  random  Gaussian  matrices.  Coupled  learned  circulant  matrices 
tend  to  do  better  than  Gaussian  random  ones  when  m  is  smaller.  Although  the  best 
PSNRs  were  achieved  with  Gaussian  random  matrices,  the  best  visual  results  are 
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Figure  3.  Comparison  results*  of  four  different  sensing  matrices: 
rand-circ ,  Gaussian ,  opt-plus-rand-circ  and  coupled- plus -rand- circ 
on  Berkeley  segmentation  dataset  for  different  KSVD  sparsity  levels 
S  =  6,  8  and  different  sampled  row  numbers  m  =  16,  24. 


rand-circ 

Gaussian 

r  .  .  J 

[fgHH 

opt-plus-rand-circ 

coupled-plus-rand-circ 

ii  h  lil  111  111 

0  0.01  0.05  0.10  0.15 

noise  level 


(a)  m  =  16,  S  =  6 


t  rand-circ 

r  .j 

i  Gaussian 

L _ j 

|  opt-plus-rand-circ 
|  coupled-plus-rand-circ 

0.01  0.05  0.10 

noise  level 

(b)  m  =  24,  S  =  6 


o.oi 
0.008 
|  0.006 
:  0.004 
0.002 
0 


|  rand-circ 
j  Gaussian 

|  opt-plus-rand-circ 
|  coupled-plus-rand-cir(|; 


III  111  111  111  ll 


0.01  0.05  0.10 

noise  level 


t  rand-circ 

*  | 

i  Gaussian 

|  opt-plus-rand-circ 

|  coupled-plus-rand-circ 

0.01  0.05  o.io 

noise  level 


(c)  m  =  16,  S  =  8  (d)  m  =  24,  S  =  8 

*Some  MSEs  are  out  of  the  display  range  and  they  are  larger  than  0.02. 


those  with  the  coupled  learned  circulant  matrices.  In  addition,  the  coupled  approach 
did  slightly  better  than  the  uncoupled  one. 

6.4.  Image  tests  with  2D  circulant.  For  the  tests  in  Section  6.3,  each  selected 
patch  was  reshaped  to  a  vector  by  stacking  its  columns.  In  this  subsection,  we 
compare  rand-2D-circ  and  opt-plus-rand-2D-circ  to  illustrate  how  to  directly  apply 
2D  circulant  operator  on  the  squared  patches.  The  dictionary  learned  by  KSVD  in 
Section  6.3  with  the  sparsity  level  S  =  6  was  used  in  this  test.  Note  that  each  atom 
in  the  dictionary  becomes  a  squared  patch  as  opposed  to  a  vector.  We  used  the 
same  600  patches  and  the  same  two  images  as  in  Section  6.3  for  this  test.  Instead 
of  solving  (30),  now  YALL1  was  employed  to  solve 

min  ||0||i  +  — 1| VnC2CQ(d)  -  b\\\ ,  (31) 

9  (J 

where  denotes  normalized  partial  2D  circulant  operator,  Q  is  defined  in 

Section  3.2,  b  =  VqC2jC(X1  -\-crp)  with  the  same  p  and  a  to  those  in  Section  6.3,  and 
X1  is  the  ith  tested  patch  corresponding  to  the  reshaped  vector  xl  in  (30).  Figure  5 
plots  the  average  MSEs  of  20  independent  trials  with  the  number  of  measurements 
m  =  16,24  and  noise  level  a  =  0,0.01,0.05,0.10,0.15,  and  Figure  6  plots  one  set 
of  recovered  images  Castle  and  Gulf  by  YALL1  with  the  number  of  measurements 
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Table  2.  Comparison  results  of  recovered  images  Castle  and  Gulf 
by  YALL1  with  four  different  sensing  matrices:  rand-circ ,  Gauss¬ 
ian,  opt-plus-rand-circ ,  coupled-plus-rand- circ 


Castle 

rand 

-circ 

Gaussian 

opt-plus-rand-circ 

coupled-plus-rand- circ 

m 

<T 

PSNR 

MSE 

PSNR 

MSE 

PSNR 

MSE 

PSNR 

MSE 

16 

0.00 

19.41 

4.78e-2 

25.58 

2.96e-3 

25.03 

3.16e-3 

25.31 

2.96e-3 

16 

0.01 

18.61 

4.36e-2 

24.40 

3.85e-3 

24.41 

3.66e-3 

24.72 

3.39e-3 

16 

0.05 

17.87 

5.06e-2 

23.55 

4.67e-3 

23.56 

4.44e-3 

23.95 

4.05e-3 

16 

0.10 

16.88 

5.54e-2 

22.35 

6.24e-3 

22.70 

5.40e-3 

23.05 

4.98e-3 

16 

0.15 

16.22 

5.87e-2 

21.27 

8.03e-3 

22.15 

6.12e-3 

22.42 

5.76e-3 

24 

0.00 

25.66 

8.57e-3 

28.77 

1.33e-3 

27.02 

2.00e-3 

27.43 

1.82e-3 

24 

0.01 

24.05 

8.31e-3 

27.22 

1.91e-3 

26.05 

2.57e-3 

26.50 

2.27e-3 

24 

0.05 

22.07 

1.82e-2 

25.58 

2.79e-3 

24.31 

3.77e-3 

24.76 

3.38e-3 

24 

0.10 

20.15 

3.17e-2 

23.91 

4.11e-3 

22.99 

5.06e-3 

23.36 

4.65e-3 

24 

0.15 

18.82 

4.08e-2 

22.56 

5.62e-3 

22.28 

5.95e-3 

22.53 

5.61e-3 

Gulf 

rand - 

-circ 

Gaussian 

opt-plus-rand-circ 

coupled-plus-rand- circ 

m 

<T 

PSNR 

MSE 

PSNR 

MSE 

PSNR 

MSE 

PSNR 

MSE 

16 

0.00 

19.57 

4.29e-2 

25.64 

2.91e-3 

25.13 

3.11e-3 

25.36 

2.93e-3 

16 

0.01 

18.87 

3.94e-2 

24.48 

3.74e-3 

24.59 

3.52e-3 

24.87 

3.28e-3 

16 

0.05 

18.06 

4.51e-2 

23.65 

4.55e-3 

23.69 

4.32e-3 

24.06 

3.96e-3 

16 

0.10 

17.23 

4.91e-2 

22.51 

5.93e-3 

22.76 

5.35e-3 

23.15 

4.89e-3 

16 

0.15 

16.48 

5.26e-2 

21.39 

7.83e-3 

22.16 

6.12e-3 

22.47 

5.70e-3 

24 

0.00 

25.91 

7.79e-3 

29.06 

1.24e-3 

27.39 

1.85e-3 

27.58 

1.76e-3 

24 

0.01 

24.29 

7.82e-3 

27.55 

1.78e-3 

26.45 

2.34e-3 

26.77 

2.13e-3 

24 

0.05 

22.35 

1.65e-2 

25.81 

2.64e-3 

24.56 

3.57e-3 

25.01 

3.20e-3 

24 

0.10 

20.44 

2.84e-2 

24.08 

3.95e-3 

23.12 

4.93e-3 

23.55 

4.47e-3 

24 

0.15 

19.06 

3.64e-2 

22.73 

5.40e-3 

22.29 

5.94e-3 

22.68 

5.44e-3 

m  =  24  and  noise  level  a  =  0.01.  We  can  see  that  similar  results  were  obtained  as 
in  Section  6.3  where  circulant  sensing  matrices  were  compared. 

6.5.  Image-scale  recovery.  In  previous  tests,  we  divided  an  image  into  a  number 
of  small  patches  and  recovered  each  patch  from  its  linear  measurements,  since  the 
learned  circulant  sensing  matrices/operators  need  to  match  those  dictionary  atoms 
in  size.  In  practice,  we  often  take  measurements  directly  from  the  entire  image 
instead  of  its  small  patches,  so  the  previous  learned  patch-scale  dictionaries  do  not 
match  the  image.  In  this  section,  we  first  use  the  method  discussed  in  section  4  to 
generate  an  image-scale  dictionary  \l/  from  the  patch-scale  one  learned  by  KSVD 
in  section  6.3  with  sparsity  level  S  =  6.  Then  we  use  ^  to  recover  an  image 
X  G  CNM  from  its  measurements  b  =  Q(X  +  arj)  by  solving 

m.m\\e\\1  +  ~\\gQ(e)-b\\l  (32) 

where  Q  =  is  a  normalized  partial  2D  circulant  operator  defined  on  CNl  x  N<2 , 

r]  ~  A /"(0,/)  is  Gaussian  noise,  a  =  <r|| A'||oo/IM|oo,  and  Q  is  a  2D  operator  (corre¬ 
sponding  to  defined  in  the  same  way  as  in  section  3.2. 

At  first  glance,  the  recovery  by  solving  (32)  may  take  a  lot  of  time  since  the 
dictionary  \l/  has  so  many  atoms.  However,  the  sparsity  structure  of  these  atoms 
enables  recoveries  to  be  as  fast  as  those  using  patch-size  dictionaries.  Note  that 
fully  random  sensing  operators  are  out  of  the  game  at  the  image  scale  since  the 


MSE 
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Figure  4.  One  set  of  recovered  images  Castle  (upper  four)  and 
Gulf  (lower  four)  by  YALL1  with  four  different  sensing  matrices: 
rand-circ ,  Gaussian ,  opt-plus-rand-circ ,  coupled-plus-rand- circ  for 
noise  level  <r  =  0.01,  KSVD  sparsity  level  S  =  6  and  selected  row 
number  m  =  24 


rand-circ 

PSNR=25.74 


Gaussian 

PSNR=27.83 


opt-plus-rand-circ  co  up  l  ed-p  lus-rand-circ 

PSNR=27.38  PSNR=27.58 


rand-circ ,  PSNR=25.99 


Gaussian ,  PSNR=28.05 


opt-plus-rand-circ ,  PSNR=27.84 


coupled-plus-rand- circ,  PSNR=27.88 


Figure  5.  Comparison  results*  of  solutions  by  YALL1  with  two 
different  2D  circulant  operators:  rand-2D-circ  and  opt-plus-rand- 
2D-circ  from  m  =  16  (left)  and  m  =  24  (right)  measurements 


MMMI  rand-2D-circ 

Ml  rand-2D-circ 

MBH  opt-plus-rand-2D-circ 

MHH  opt-plus-rand-2D-circ 

Some  MSEs  are  out  of  the  display  range  and  larger  than  0.02  or  even  0.04. 
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Figure  6.  One  set  of  recovered  images  Castle  (left  two)  and  Gulf 
(right  two)  by  YALL1  using  rand-2D-circ  and  opt-plus-rand-2D- 
circ  with  m  —  24  measurements  and  noise  level  a  =  0.01 


rand-2D-circ,  PSNR  =  25.54 


Figure  7.  One  set  of  recovered  images  Plane  (left  two)  and  Gulf 
(right  two)  by  YALL1  with  two  different  2D  circulant  operators: 
rand-2D-circ  and  opt-plus-rand-2D-circ  for  sample  ratio  SR  =  0.30 
and  noise  level  a  =  0.01 


rand-2D-circ  opt-plus-rand-2D-circ  rand-2D-circ  opt-plus-rand-2D-circ 

PSNR  =  29.82  PSNR  =  30.59  PSNR  =  26.59  PSNR  =  27.21 


number  of  free  entries  of  each  is  at  least  ten  thousand  times  of  the  image  resolution, 
making  such  an  operator  impossible  to  fit  in  the  memory  of  a  typical  workstation. 

We  compared  rand-2D-circ  and  opt-plus-rand-2D-circ  on  two  images.  Both  of 
them  had  the  resolution  of  241  x  129,  and  they  were  obtained  by  cropping4  their 
larger  originals  Plane  and  Gulf  in  the  testing  set  of  Berkeley  segmentation  dataset. 
The  noise  level  a  was  set  to  0,  0.01,  0.05,  0.10,  0.15  and  the  sample  ratio  SR  := 
to  0.20,  0.30.  Table  3  lists  the  average  results  of  20  independent  trials,  and  Figure  7 
plots  one  set  of  recovered  images  Plane  (left  two)  and  Gulf  (right  two)  for  a  =  0.01 
and  SR  =  0.30.  From  the  table,  we  can  see  that  the  generated  image-scale  dictionary 
performed  well  in  recovering  the  images  with  sufficiently  many  measurements,  and 
that  the  optimized  circulant  operators  did  significantly  better  than  the  random  ones 
on  average. 


4Though  the  recovery  by  solving  (32)  did  not  take  much  time,  it  took  us  quite  a  long  time  to 
obtain  the  optimized  2D  circulant  operator  by  the  method  discussed  in  section  3.2.  The  learning 
on  larger  images  would  take  much  more  time.  To  save  time,  we  simply  used  the  cropped  images 
to  test  our  idea. 
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Table  3.  Comparison  results  of  recovered  full-size  images  by 
YALL1  with  two  different  2D  circulant  operators:  rand-2D-circ 
and  opt-plus-rand-2D-circ ;  SR=sample  ratio 


Plane 

rand-2D-circ 

opt-plus- 

rand-2D-circ 

Plane 

rand-2D-circ 

opt-plus- 

rand-2D-circ 

SR 

<7 

PSNR 

MSE 

PSNR 

MSE 

SR 

G 

PSNR 

MSE 

PSNR 

MSE 

0.20 

0.00 

22.35 

3.54e-2 

26.87 

2. Ole-3 

0.30 

0.00 

28.35 

3.47e-3 

30.01 

1.00e-3 

0.20 

0.01 

22.50 

3.31e-2 

26.78 

2.10e-3 

0.30 

0.01 

28.57 

3.00e-3 

30.05 

9.93e-4 

0.20 

0.05 

22.11 

3.24e-2 

26.23 

2.38e-3 

0.30 

0.05 

27.73 

3.59e-3 

29.20 

1.21e-3 

0.20 

0.10 

21.16 

3.41e-2 

25.04 

3.14e-3 

0.30 

0.10 

25.99 

5.22e-3 

27.48 

1.80e-3 

0.20 

0.15 

20.12 

3.67e-2 

23.84 

4.13e-3 

0.30 

0.15 

24.38 

7.34e-3 

25.94 

2.56e-3 

Gulf 

rand-2D-circ 

opt-plus- 

rand-2D-circ 

Gulf 

rand-2D-circ 

opt-plus- 

rand-2D-circ 

SR 

G 

PSNR 

MSE 

PSNR 

MSE 

SR 

G 

PSNR 

MSE 

PSNR 

MSE 

0.20 

0.00 

21.33 

1.61e-2 

24.62 

3.46e-3 

0.30 

0.00 

25.58 

5.00e-3 

27.00 

2.00e-3 

0.20 

0.01 

21.38 

1.59e-2 

24.52 

3.53e-3 

0.30 

0.01 

25.66 

4.67e-3 

26.97 

2.02e-3 

0.20 

0.05 

21.19 

1.62e-2 

24.31 

3.71e-3 

0.30 

0.05 

25.32 

4.95e-3 

26.65 

2.17e-3 

0.20 

0.10 

20.68 

1.72e-2 

23.71 

4.26e-3 

0.30 

0.10 

24.34 

6.24e-3 

25.75 

2.67e-3 

0.20 

0.15 

20.02 

1.87e-2 

22.98 

5.04e-3 

0.30 

0.15 

23.28 

7.90e-2 

24.75 

3.37e-3 
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